Some housekeeping (again), installing necessary packages.
list.of.packages <- c("igraph", "tidygraph", "ggraph")
new.packages <- list.of.packages[!(list.of.packages %in% installed.packages()[,"Package"])]
if(length(new.packages)) install.packages(new.packages)
rm(list.of.packages, new.packages)
So, before we talk about networks, one thing upfront… why should we? I mean, they undeniably look pretty, don’t they?
Somehow, the visualization of networks fascinates the human mind (find a short TED talk on networks and how they depict our world here), and has even inspired its own art movement, networkism (see some examples here).
Yet, besides that, is there an analytical value for a data scientist to bother about networks?
There are a number of applications designed for network analysis and the creation of network graphs, such as Gephi and Cytoscape. Though not specifically designed for it, R has developed into a powerful tool for network analysis.
Significant network analysis packages for R include the network, sna, and igraph packages. In addition, Thomas Lin Pedersen has recently released the tidygraph package, which leverages the power of igraph in a manner consistent with the tidyverse workflow. Even better, he tops it off with ggraph, a network visualization package with a consistent ggplot2 look and feel.
R can also be used to make interactive network graphs with the htmlwidgets framework that translates R code to JavaScript. Cool implementations thereof are the visNetwork and networkD3 packages.
As an analytical tool, I will mostly use igraph in this lab. In terms of functions, it is pretty much equivalent to network, yet slightly more powerful, better integrated, and better maintained. Since both packages share many function names, better not load them both at once.
First of all, what is a network? Plainly speaking, a network is a system of elements which are connected by some relationship. The vocabulary can be a bit technical and even inconsistent between different disciplines, packages, and software. The whole system is (surprise, surprise) usually called a network or graph. The elements are commonly referred to as nodes (system theory jargon) or vertices (graph theory jargon) of a graph, while the connections are edges or links. I will mostly refer to the elements as nodes, and their connections as edges.
Generally, networks are a form of representing relational data. This is a very general tool that can be applied to many different types of relationships between all kinds of elements. The content, meaning, and interpretation of course depend on which elements we display, and which types of relationships. For example:
The possibilities to depict relational data are manifold. For example:
Note: Content matters! Each relation yields a different structure and has different effects. Theories might make sense in an inter-personal context, but not in an inter-organizational or non-social one.
Most real-world relational data comes in the form of what we call an edge list: a data frame that contains a minimum of two columns, one column of nodes that are the source of a connection and another column of nodes that are the target of the connection. The nodes in the data are identified by unique IDs. So, every row that contains the ID of one element in column 1 and the ID of another element in column 2 indicates that a connection between them exists. If the distinction between source and target is meaningful, the network is directed. If the distinction is not meaningful, the network is undirected (more on that later). An edge list can also contain additional columns that describe attributes of the edges, such as a magnitude. If the edges have a magnitude attribute, the graph is considered weighted (more on that later). Below is an example of a minimal edge list created with the tibble() function.
library(tidyverse) # provides tibble() and the %>% pipe used below
edge_list <- tibble(from = c(1, 2, 2, 3, 4), to = c(2, 3, 4, 2, 1))
edge_list
Sometimes it is preferable to also create a separate node list. At its simplest, a node list is a data frame with a single column - which I will label as “id” - that lists the node IDs found in the edge list. The advantage of creating a separate node list is the ability to add attribute columns to the data frame such as the names of the nodes or any kind of groupings.
library(tidyverse)
node_list <- tibble(id = 1:4, group = sample(letters[1:2], 4, replace = TRUE))
node_list
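If no separate node list is at hand, one can be derived from the edge list itself. Below is a minimal sketch in base R. It assumes the edge list from above plus a made-up weight column; the weights and the n_edges column are purely illustrative and not part of the original data.

```r
# Edge list from above, with a hypothetical weight column added for illustration
edges <- data.frame(
  from   = c(1, 2, 2, 3, 4),
  to     = c(2, 3, 4, 2, 1),
  weight = c(3, 1, 2, 1, 5)  # made-up magnitudes
)

# Node list: every ID that appears as source or target, listed once
nodes <- data.frame(id = sort(unique(c(edges$from, edges$to))))

# Example node attribute: number of edge-list rows each node takes part in
nodes$n_edges <- sapply(nodes$id, function(i) sum(edges$from == i | edges$to == i))

nodes
```

Any further node attribute (names, groups, and so on) can be added as an extra column in the same way.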
A second popular form of network representation is the adjacency matrix (also called sociomatrix). It is an \(n \times n\) matrix, where \(n\) is the number of elements whose relationships are represented. The value in the cell at the intersection of row \(i\) and column \(j\) indicates whether an edge from element \(i\) to element \(j\) is present (=1) or absent (=0).
Tip: Given an edge list, an adjacency matrix can easily be produced by cross-tabulating:
adj_matrix <- table(edge_list) %>% as.matrix()
adj_matrix
    to
from 1 2 3 4
   1 0 1 0 0
   2 0 0 1 1
   3 0 1 0 0
   4 1 0 0 0
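The same matrix can also be built by hand, which shows what "undirected" means in matrix terms: the matrix is symmetric. The following is only a sketch in base R, and all variable names are my own.

```r
edges <- data.frame(from = c(1, 2, 2, 3, 4), to = c(2, 3, 4, 2, 1))
n   <- max(edges$from, edges$to)
ids <- as.character(seq_len(n))

# Start with an empty n x n matrix and set cell (from, to) to 1 for every edge
adj <- matrix(0L, nrow = n, ncol = n, dimnames = list(ids, ids))
adj[cbind(edges$from, edges$to)] <- 1L

isSymmetric(adj)              # FALSE: e.g. 1 -> 2 exists, but 2 -> 1 does not

# Ignoring direction: an edge exists if it is present in either direction
adj_undirected <- pmax(adj, t(adj))
isSymmetric(adj_undirected)   # TRUE
sum(adj_undirected) / 2       # 4: the pair 2 -> 3 and 3 -> 2 collapses into one edge
```

Note that the five rows of the edge list describe only four distinct undirected connections.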
To create an igraph object from an edge-list data frame, we can use the graph_from_data_frame() function, which is a bit more straightforward than network(). There are three arguments in the graph_from_data_frame() function: d, vertices, and directed. Here, d refers to the edge list, vertices to the node list, and directed can be either TRUE or FALSE depending on whether the data is directed or undirected. By default, graph_from_data_frame() treats the first two columns of the edge list as the source and target of each edge, and any remaining columns as edge attributes.
library(igraph)
g <- graph_from_data_frame(d = edge_list, vertices = node_list, directed = FALSE)
g
IGRAPH d85cc1e UN-- 4 5 --
+ attr: name (v/c), group (v/c)
+ edges from d85cc1e (vertex names):
[1] 1--2 2--3 2--4 2--3 1--4
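For comparison, here is a sketch of the directed variant of the same toy data. The object names (edges, nodes, g_dir) are my own; the only real change is directed = TRUE, after which it makes sense to distinguish incoming from outgoing edges.

```r
library(igraph)

edges <- data.frame(from = c(1, 2, 2, 3, 4), to = c(2, 3, 4, 2, 1))
nodes <- data.frame(id = 1:4)

# Same data, but now source and target are treated as meaningful
g_dir <- graph_from_data_frame(d = edges, vertices = nodes, directed = TRUE)

is_directed(g_dir)            # TRUE
degree(g_dir, mode = "out")   # number of edges leaving each node
degree(g_dir, mode = "in")    # number of edges arriving at each node
```

In the directed graph, 2->3 and 3->2 are two different edges, while the undirected version above lists 2--3 twice.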
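Let’s inspect the resulting object. An igraph graph object summary reveals some interesting information.
First, the header tells us the graph type: undirected (UN) or directed (DN). It is followed by the number of nodes (4) and edges (5). Next come the attributes (attr:), which in this case are the node attributes name (v/c) and group (v/c), where v/c stands for a vertex attribute of type character. Finally, the edges are listed, where n--m indicates an undirected and n->m a directed edge.
Let’s take a look at the structure of the object: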
glimpse(g[[1]])
List of 1
$ 1: 'igraph.vs' Named int [1:2] 2 4
..- attr(*, "names")= chr [1:2] "2" "4"
..- attr(*, "env")=<weakref>
..- attr(*, "graph")= chr "d85cc1e0-c674-11e8-816a-b7a131578cf9"
We see that the object has a list format, consisting of separate lists for every node, containing some attributes which are irrelevant for now, and an edge list for every node, capturing its ego network (e.g., attr(*, "names")= chr [1:2] "2" "4").
We can also plot it to take a look. igraph objects can be used directly with the plot() function. The results can be adjusted with a set of parameters we will discover later. It’s not super pretty, therefore we will later also explore more powerful plotting tools for graphs. However, it’s quick and dirty, so let’s take it like that for now.
plot(g)
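Yeah, that’s the graph. We can also use the adjacency matrix to create it. Note that the duplicated 2--3 connection collapses into a single edge this way, so we end up with 4 instead of 5 edges.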
g <- graph_from_adjacency_matrix(adj_matrix, mode = "undirected")
g
IGRAPH fcf6ff5 UN-- 4 4 --
+ attr: name (v/c)
+ edges from fcf6ff5 (vertex names):
[1] 1--2 1--4 2--3 2--4
We can inspect and manipulate the nodes via V(g) (V for vertices, it’s graph-theory slang), and the edges via E(g).
V(g)
+ 4/4 vertices, named, from fcf6ff5:
[1] 1 2 3 4
E(g)
+ 4/4 edges from fcf6ff5 (vertex names):
[1] 1--2 1--4 2--3 2--4
We can also use most of the base-R slicing&dicing.
V(g)[1:3]
+ 3/4 vertices, named, from fcf6ff5:
[1] 1 2 3
E(g)[2:4]
+ 3/4 edges from fcf6ff5 (vertex names):
[1] 1--4 2--3 2--4
Remember, it’s a list-object. So, if we just want to have the values, we have to use the double bracket [[x]].
V(g)[[1:3]]
+ 3/4 vertices, named, from fcf6ff5:
We can also use the $ notation.
V(g)$name
[1] "1" "2" "3" "4"
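The $ notation also works for assignment, which is how attributes are added or changed. A small sketch on a toy graph follows; g_demo and the group and weight values are hypothetical and only serve as illustration.

```r
library(igraph)

edges  <- data.frame(from = c(1, 2, 2, 4), to = c(2, 3, 4, 1))
g_demo <- graph_from_data_frame(edges, vertices = data.frame(id = 1:4), directed = FALSE)

# Assigning via $ creates (or overwrites) an attribute
V(g_demo)$group  <- c("a", "b", "a", "b")   # hypothetical node attribute
E(g_demo)$weight <- c(1, 5, 2, 3)           # hypothetical edge attribute

# Attributes can then be used inside the brackets to filter
V(g_demo)[group == "a"]$name   # names of the nodes in group "a"
E(g_demo)[weight >= 3]         # edges with a weight of at least 3
```

Once an edge attribute called weight exists, igraph treats the graph as weighted.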
rm(list=ls())
files <- list.files(path ="../data/GoT/", full.names = TRUE)
files
[1] "../data/GoT/asoiaf-all-edges.csv" "../data/GoT/asoiaf-all-nodes.csv"
[3] "../data/GoT/asoiaf-book1-edges.csv" "../data/GoT/asoiaf-book1-nodes.csv"
[5] "../data/GoT/asoiaf-book2-edges.csv" "../data/GoT/asoiaf-book2-nodes.csv"
[7] "../data/GoT/asoiaf-book3-edges.csv" "../data/GoT/asoiaf-book3-nodes.csv"
[9] "../data/GoT/asoiaf-book4-edges.csv" "../data/GoT/asoiaf-book4-nodes.csv"
[11] "../data/GoT/asoiaf-book45-edges.csv" "../data/GoT/asoiaf-book45-nodes.csv"
[13] "../data/GoT/asoiaf-book5-edges.csv" "../data/GoT/asoiaf-book5-nodes.csv"
[15] "../data/GoT/union_characters.RDS" "../data/GoT/union_edges.RDS"
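library(data.table) # provides fread()
library(magrittr) # provides the %<>% assignment pipe used below
edges.cooc.all <- fread(files[1], data.table = FALSE)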
head(edges.cooc.all)
So, that’s what we have: a classical edge list, with id1 in column 1 and id2 in column 2. Note that the edges are in this case weighted. I don’t like the separating “-” in the names, so let’s get rid of them.
colnames(edges.cooc.all) <- tolower(colnames(edges.cooc.all))
edges.cooc.all %<>%
mutate(source = gsub("-", " ", source),
target = gsub("-", " ", target))
Ok, let’s see how many characters we have overall.
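edges.cooc.all %>%
select(-type) %>%
gather(x, name, source:target) %>%
pull(name) %>% # extract the names first; piping into n_distinct(.$name) would also pass the whole data frame and count distinct rows
n_distinct()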
chars.main <- edges.cooc.all %>%
select(-type) %>%
gather(x, name, source:target) %>%
group_by(name) %>%
summarise(sum_weight = sum(weight)) %>%
ungroup() %>%
arrange(desc(sum_weight)) %>%
top_n(50)
head(chars.main)
So far so good, if we only go by edge weights, Tyrion is going to make it…. my favorite anyhow…
However, let’s reduce our edge list to these main characters, just to warm up and keep the overview.
edges.cooc <- edges.cooc.all %>%
filter(source %in% chars.main$name & target %in% chars.main$name) %>%
select(source, target, weight)
g <- graph_from_data_frame(d = edges.cooc, directed = FALSE)
g
IGRAPH 10c1de6 UNW- 50 402 --
+ attr: name (v/c), weight (e/n)
+ edges from 10c1de6 (vertex names):
[1] Aemon Targaryen (Maester Aemon)--Grenn Aemon Targaryen (Maester Aemon)--Jeor Mormont
[3] Aemon Targaryen (Maester Aemon)--Jon Snow Aemon Targaryen (Maester Aemon)--Mance Rayder
[5] Aemon Targaryen (Maester Aemon)--Robert Baratheon Aemon Targaryen (Maester Aemon)--Samwell Tarly
[7] Aemon Targaryen (Maester Aemon)--Stannis Baratheon Arya Stark --Bran Stark
[9] Arya Stark --Catelyn Stark Arya Stark --Cersei Lannister
[11] Arya Stark --Eddard Stark Arya Stark --Gregor Clegane
[13] Arya Stark --Ilyn Payne Arya Stark --Jaime Lannister
[15] Arya Stark --Joffrey Baratheon Arya Stark --Jon Snow
+ ... omitted several edges
Note that this co-occurrence network is weighted (by the number of co-occurrences) and undirected.
is_weighted(g)
[1] TRUE
is_directed(g)
[1] FALSE
We already know from the summary, but we can also count the number of nodes and edges as follows:
# Count number of edges
gsize(g)
[1] 402
# Count number of vertices
gorder(g)
[1] 50
We can give the graph a first plot to see what happens there. It’s not pretty, but we will fine-tune it later
plot(g)
To focus on the stronger relationships, let’s drop all edges with a weight below 20, and afterwards all nodes that are left without any connection (isolates).
g <- delete_edges(g, E(g)[weight < 20])
g <- delete_vertices(g, degree(g) == 0)
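What these two steps do is easier to see on a tiny example. The sketch below uses a made-up weighted edge list (the names A to D and the weights are hypothetical) and the same threshold of 20.

```r
library(igraph)

# Toy weighted edge list (hypothetical values)
toy <- data.frame(
  from   = c("A", "A", "B", "C"),
  to     = c("B", "C", "C", "D"),
  weight = c(30, 5, 25, 10)
)
g_toy <- graph_from_data_frame(toy, directed = FALSE)

# Step 1: drop weak edges
g_toy <- delete_edges(g_toy, E(g_toy)[weight < 20])

# Step 2: drop nodes that lost all their edges
g_toy <- delete_vertices(g_toy, V(g_toy)[degree(g_toy) == 0])

V(g_toy)$name   # "A" "B" "C": D was only connected through a weak edge
gsize(g_toy)    # 2 edges remain
```

The order matters: isolates only appear after the weak edges are gone, so the vertices have to be removed second.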
# Find all edges that include "Daenerys Targaryen"
E(g)[[inc('Daenerys Targaryen')]]
+ 6/189 edges from 17b0991 (vertex names):
# Find all edges with a weight of 150 or more
E(g)[[weight >= 150]]
+ 11/189 edges from 17b0991 (vertex names):
hist(E(g)$weight)
Let’s see who is the most central figure in this network of interactions.
degree(g)
Aemon Targaryen (Maester Aemon) Arya Stark Barristan Selmy
3 9 5
Bran Stark Brienne of Tarth Bronn
14 4 2
Catelyn Stark Cersei Lannister Daenerys Targaryen
16 20 6
Davos Seaworth Drogo Eddard Stark
2 2 18
Edmure Tully Gregor Clegane Grenn
3 4 2
Hizdahr zo Loraq Hodor Ilyn Payne
2 4 4
Jaime Lannister Jeor Mormont Joffrey Baratheon
16 4 19
Jojen Reed Jon Snow Jorah Mormont
3 13 3
Loras Tyrell Luwin Lysa Arryn
6 6 4
Mance Rayder Margaery Tyrell Meera Reed
1 5 3
Melisandre Meryn Trant Myrcella Baratheon
3 3 3
Petyr Baelish Pycelle Renly Baratheon
10 6 10
Rickon Stark Robb Stark Robert Baratheon
6 17 16
Rodrik Cassel Samwell Tarly Sandor Clegane
5 5 6
Sansa Stark Stannis Baratheon Theon Greyjoy
17 13 6
Tommen Baratheon Tyrion Lannister Tywin Lannister
6 25 9
Quentyn Martell Varys
2 7
which.max(degree(g))
Tyrion Lannister
47
strength(g)
Aemon Targaryen (Maester Aemon) Arya Stark Barristan Selmy
234 512 226
Bran Stark Brienne of Tarth Bronn
1206 225 160
Catelyn Stark Cersei Lannister Daenerys Targaryen
766 1438 547
Davos Seaworth Drogo Eddard Stark
195 151 1175
Edmure Tully Gregor Clegane Grenn
128 102 121
Hizdahr zo Loraq Hodor Ilyn Payne
138 333 112
Jaime Lannister Jeor Mormont Joffrey Baratheon
862 277 1343
Jojen Reed Jon Snow Jorah Mormont
223 1238 216
Loras Tyrell Luwin Lysa Arryn
167 238 179
Mance Rayder Margaery Tyrell Meera Reed
112 258 255
Melisandre Meryn Trant Myrcella Baratheon
208 102 87
Petyr Baelish Pycelle Renly Baratheon
477 193 448
Rickon Stark Robb Stark Robert Baratheon
263 966 1091
Rodrik Cassel Samwell Tarly Sandor Clegane
137 472 259
Sansa Stark Stannis Baratheon Theon Greyjoy
1059 771 239
Tommen Baratheon Tyrion Lannister Tywin Lannister
337 1694 392
Quentyn Martell Varys
80 400
which.max(strength(g))
Tyrion Lannister
47
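Here degree and strength point to the same character, but that does not have to be the case: degree() counts a node's edges, while strength() sums their weights. A sketch on a made-up toy graph (names and weights are hypothetical) where the two disagree:

```r
library(igraph)

toy <- data.frame(
  from   = c("A", "A", "B", "C"),
  to     = c("B", "C", "C", "D"),
  weight = c(30, 5, 25, 10)   # hypothetical weights
)
g_toy <- graph_from_data_frame(toy, directed = FALSE)

degree(g_toy)     # number of edges per node
strength(g_toy)   # sum of edge weights per node

names(which.max(degree(g_toy)))     # "C": most connections (3)
names(which.max(strength(g_toy)))   # "B": fewer, but heavier connections (30 + 25)
```

Also note that which.max() returns the position of the node in the vector (47 above), not its score.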
neighbors(g, 'Robert Baratheon')
+ 16/50 vertices, named, from 17b0991:
[1] Barristan Selmy Catelyn Stark Cersei Lannister Daenerys Targaryen Eddard Stark
[6] Jaime Lannister Joffrey Baratheon Jon Snow Petyr Baelish Pycelle
[11] Renly Baratheon Sansa Stark Stannis Baratheon Tyrion Lannister Tywin Lannister
[16] Varys
We can also extract a node’s ego network, here all nodes that can be reached from Drogo within 2 steps.
ego(g, 2, "Drogo")[[1]]
+ 8/50 vertices, named, from 17b0991:
[1] Drogo Daenerys Targaryen Jorah Mormont Barristan Selmy Hizdahr zo Loraq Robert Baratheon
[7] Quentyn Martell Tyrion Lannister
g.drogo <- make_ego_graph(g, 2, nodes = "Drogo")[[1]]
g.danny <- make_ego_graph(g, 2, nodes = "Daenerys Targaryen")[[1]]
plot(g.drogo)
plot(g.danny)
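Btw: To merge two graphs, just take their union:
g.merge <- igraph::union(g.drogo, g.danny) # edge attributes present in both graphs get the suffixes _1 and _2
g.merge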
IGRAPH 45a00e4 UN-- 21 82 --
+ attr: name (v/c), weight_1 (e/n), weight_2 (e/n)
+ edges from 45a00e4 (vertex names):
[1] Stannis Baratheon--Tywin Lannister Renly Baratheon --Stannis Baratheon Pycelle --Varys
[4] Petyr Baelish --Varys Petyr Baelish --Sansa Stark Petyr Baelish --Pycelle
[7] Jon Snow --Stannis Baratheon Joffrey Baratheon--Varys Joffrey Baratheon--Tywin Lannister
[10] Joffrey Baratheon--Stannis Baratheon Joffrey Baratheon--Sansa Stark Joffrey Baratheon--Renly Baratheon
[13] Joffrey Baratheon--Petyr Baelish Jaime Lannister --Tywin Lannister Jaime Lannister --Sansa Stark
[16] Jaime Lannister --Renly Baratheon Jaime Lannister --Joffrey Baratheon Eddard Stark --Varys
[19] Eddard Stark --Stannis Baratheon Eddard Stark --Sansa Stark Eddard Stark --Renly Baratheon
[22] Eddard Stark --Pycelle Eddard Stark --Petyr Baelish Eddard Stark --Jon Snow
+ ... omitted several edges
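The weight_1 and weight_2 attributes in the summary come from the merge: both input graphs carry a weight attribute, so each one is kept under its own name. A sketch on two tiny made-up graphs (g1, g2 and their weights are hypothetical); taking the larger of the two values is just one possible way to get back to a single weight column.

```r
library(igraph)

g1 <- graph_from_data_frame(data.frame(from = "A", to = "B", weight = 1), directed = FALSE)
g2 <- graph_from_data_frame(data.frame(from = "B", to = "C", weight = 2), directed = FALSE)

# Vertices are matched by name; the namespace prefix avoids clashes with other union() functions
g_union <- igraph::union(g1, g2)

edge_attr_names(g_union)   # one weight attribute per input graph

# One option: use whichever weight is available (the larger one if both are)
E(g_union)$weight <- pmax(E(g_union)$weight_1, E(g_union)$weight_2, na.rm = TRUE)
E(g_union)$weight
```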
betweenness(g)
Aemon Targaryen (Maester Aemon) Arya Stark Barristan Selmy
0.0 19.0 93.0
Bran Stark Brienne of Tarth Bronn
17.5 0.0 0.0
Catelyn Stark Cersei Lannister Daenerys Targaryen
94.0 96.5 0.0
Davos Seaworth Drogo Eddard Stark
0.0 0.0 138.0
Edmure Tully Gregor Clegane Grenn
0.0 15.0 0.0
Hizdahr zo Loraq Hodor Ilyn Payne
0.0 85.0 8.5
Jaime Lannister Jeor Mormont Joffrey Baratheon
67.5 57.0 63.0
Jojen Reed Jon Snow Jorah Mormont
0.0 116.0 47.5
Loras Tyrell Luwin Lysa Arryn
57.5 133.0 0.0
Mance Rayder Margaery Tyrell Meera Reed
0.0 0.0 0.0
Melisandre Meryn Trant Myrcella Baratheon
19.0 0.0 1.0
Petyr Baelish Pycelle Renly Baratheon
7.0 25.0 31.0
Rickon Stark Robb Stark Robert Baratheon
73.0 98.5 155.5
Rodrik Cassel Samwell Tarly Sandor Clegane
25.0 20.0 13.0
Sansa Stark Stannis Baratheon Theon Greyjoy
137.0 103.0 30.0
Tommen Baratheon Tyrion Lannister Tywin Lannister
0.0 309.0 24.0
Quentyn Martell Varys
0.0 0.0
eigen_centrality(g, scale = TRUE)$vector %>% round(3)
Aemon Targaryen (Maester Aemon) Arya Stark Barristan Selmy
0.064 0.332 0.073
Bran Stark Brienne of Tarth Bronn
0.297 0.126 0.183
Catelyn Stark Cersei Lannister Daenerys Targaryen
0.415 0.952 0.039
Davos Seaworth Drogo Eddard Stark
0.060 0.007 0.754
Edmure Tully Gregor Clegane Grenn
0.067 0.072 0.035
Hizdahr zo Loraq Hodor Ilyn Payne
0.008 0.079 0.086
Jaime Lannister Jeor Mormont Joffrey Baratheon
0.547 0.113 0.922
Jojen Reed Jon Snow Jorah Mormont
0.046 0.364 0.038
Loras Tyrell Luwin Lysa Arryn
0.112 0.061 0.121
Mance Rayder Margaery Tyrell Meera Reed
0.047 0.200 0.051
Melisandre Meryn Trant Myrcella Baratheon
0.065 0.089 0.067
Petyr Baelish Pycelle Renly Baratheon
0.353 0.156 0.267
Rickon Stark Robb Stark Robert Baratheon
0.098 0.428 0.755
Rodrik Cassel Samwell Tarly Sandor Clegane
0.044 0.127 0.186
Sansa Stark Stannis Baratheon Theon Greyjoy
0.723 0.367 0.091
Tommen Baratheon Tyrion Lannister Tywin Lannister
0.270 1.000 0.333
Quentyn Martell Varys
0.004 0.345
edge_density(g)
[1] 0.1542857
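The density is simply the share of realized among all possible edges. For an undirected graph without self-loops, n nodes allow for n(n-1)/2 edges. With the 50 nodes and 189 edges reported in the outputs above, this can be checked by hand:

```r
n <- 50    # nodes in the filtered graph
m <- 189   # edges in the filtered graph

possible <- n * (n - 1) / 2   # 1225 possible undirected edges
m / possible                  # 0.1542857
```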
diameter(g, directed = FALSE, weights = NA)
[1] 4
transitivity(g)
[1] 0.4552807
mean_distance(g, directed = FALSE)
[1] 2.337959
Ahh, you saw it coming, right? How about you explore the GoT network a bit on your own HERE. Let’s see how that works out!
So far so good. Up to now we considered undirected networks, constructed from how often characters co-occur. However, as you might already guess, that’s not where we stop.
There are also other relationships to which we can, and sometimes have to, assign a directionality. An obvious example is family ties. Here, I will use the nicely compiled dataset of the wonderful Shirin that can be found here. It contains a node list with house affiliations and further characteristics of the main characters, and an edge list of their family relationships.
rm(chars.main, g, g.danny, g.drogo, g.merge)
edges.fam <- readRDS("../data/GoT/union_edges.RDS")
nodes.fam <- readRDS("../data/GoT/union_characters.RDS")
head(nodes.fam)
head(edges.fam)
g <- graph_from_data_frame(edges.fam,
vertices = nodes.fam,
directed = TRUE)
g
IGRAPH 813c92b DN-- 208 404 --
+ attr: name (v/c), male (v/n), culture (v/c), house (v/c), popularity (v/n), house2 (v/c), color (v/c),
| shape (v/c), type (e/c), color (e/c), lty (e/c)
+ edges from 813c92b (vertex names):
[1] Lysa Arryn ->Robert Arryn Jasper Arryn ->Alys Arryn
[3] Jasper Arryn ->Jon Arryn Jon Arryn ->Robert Arryn
[5] Cersei Lannister ->Tommen Baratheon Cersei Lannister ->Joffrey Baratheon
[7] Cassana Baratheon->Stannis Baratheon Cersei Lannister ->Myrcella Baratheon
[9] Selyse Florent ->Shireen Baratheon Cassana Baratheon->Renly Baratheon
[11] Rhaelle Targaryen->Steffon Baratheon Cassana Baratheon->Robert Baratheon
[13] Robert Baratheon ->Tommen Baratheon Robert Baratheon ->Joffrey Baratheon
+ ... omitted several edges
plot(g)
For plotting the legend, I am summarizing the edge and node colors.
color_vertices <- nodes.fam %>%
group_by(house, color) %>%
summarise(n = n()) %>%
filter(!is.na(color))
colors_edges <- edges.fam %>%
group_by(type, color) %>%
summarise(n = n()) %>%
filter(!is.na(color))
plot(g,
layout = layout_with_fr(g),
vertex.label = gsub(" ", "\n", V(g)$name),
vertex.shape = V(g)$shape,
vertex.color = V(g)$color,
vertex.size = (V(g)$popularity + 0.5) * 5,
vertex.frame.color = "gray",
vertex.label.color = "black",
vertex.label.cex = 0.8,
edge.arrow.size = 0.5,
edge.color = E(g)$color,
edge.lty = E(g)$lty)
legend("topleft", legend = c(NA, "Node color:", as.character(color_vertices$house), NA, "Edge color:", as.character(colors_edges$type)), pch = 10,
col = c(NA, NA, color_vertices$color, NA, NA, colors_edges$color), pt.cex = 3, cex = 2, bty = "n", ncol = 1,
title = "")
legend("topleft", legend = "", cex = 3, bty = "n", ncol = 1,
title = "Game of Thrones Family Ties")
You might have already guessed it: we can also do a clustering exercise in networks. Here, we do not cluster nodes according to their similarity in attributes, but according to their connectivity. There are plenty of algorithms, and we will explore further ones later on. Most of them aim to find communities with maximum within-community connectivity and minimum between-community connectivity.
However, most of them are not designed to work with directed networks. Therefore, we will convert our nice network for now to an undirected one.
g.ud <- as.undirected(g)
First, we will give it a try with the edge-betweenness algorithm (Newman-Girvan). Here, high-betweenness edges are removed sequentially (recalculating at each step) and the best partitioning of the network is selected.
Let’s take a look at how it works.
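A toy graph makes the idea tangible. The sketch below is my own illustration, not part of the GoT data: two triangles are connected by a single bridge, so all shortest paths between them have to cross that bridge.

```r
library(igraph)

# Two triangles (1-2-3 and 4-5-6) connected by a single bridge (3-4, the 7th edge)
g_toy <- make_graph(c(1,2, 2,3, 1,3,  4,5, 5,6, 4,6,  3,4), directed = FALSE)

# The bridge carries all shortest paths between the two triangles
which.max(edge_betweenness(g_toy))   # 7, i.e. the bridge

# Removing high-betweenness edges first therefore separates the triangles
ceb_toy <- cluster_edge_betweenness(g_toy)
length(ceb_toy)        # 2 communities
membership(ceb_toy)    # nodes 1-3 in one group, nodes 4-6 in the other
```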
And now we run it. Since it is a hierarchical community detection technique, we could again plot a dendrogram, which we already know from hierarchical clustering.
ceb <- cluster_edge_betweenness(g.ud)
ceb
IGRAPH clustering edge betweenness, groups: 12, mod: 0.84
+ groups:
$`1`
[1] "Alys Arryn" "Elys Waynwood" "Jasper Arryn" "Jeyne Royce" "Jon Arryn"
[6] "Lysa Arryn" "Robert Arryn" "Rowena Arryn" "Edmure Tully" "Sansa Stark"
[11] "Arya Stark" "Bran Stark" "Catelyn Stark" "Eddard Stark" "Jeyne Westerling"
[16] "Rickon Stark" "Robb Stark" "Talisa Stark" "Hoster Tully" "Minisa Whent"
[21] "Petyr Baelish"
$`2`
[1] "Cassana Baratheon" "Cersei Lannister" "Jaime Lannister" "Joffrey Baratheon" "Margaery Tyrell"
[6] "Myrcella Baratheon" "Renly Baratheon" "Robert Baratheon" "Selyse Florent" "Shireen Baratheon"
+ ... omitted several groups/vertices
plot(ceb, g.ud,
vertex.frame.color = V(g)$color, # use the predefined node colors (houses)
vertex.size = (V(g)$popularity + 0.5) * 5 # scale node size by popularity
)
Let’s see how well it performs on the major houses only; the rest are too small anyhow.
bind_cols(com = ceb$membership, house = V(g.ud)$house) %>%
group_by(house) %>%
filter(n() >= 10) %>%
ungroup() %>%
table()
    house
com House Baratheon House Frey House Greyjoy House Lannister House Martell House Stark House Targaryen House Tyrell
  1               0          0             0               0             0           7               0            0
  2               9          0             0               4             0           0               0            1
  3               1          0             0               0             2           0              13            0
  4               0         13             0               0             0           0               0            0
  5               0          7             0               9             0           0               0            0
  6               0          0            14               0             0           0               0            0
  7               0          0             0              13             0           0               0            0
  8               0          0             0               0            11           0               0            0
  9               0          0             0               0             0           6               0            0
 10               0          0             0               0             0          13               0            0
 11               0          0             0               0             0           0               0            9
 12               0          0             0               0             0           0               0            5
We see, indeed, that the communities for the most part capture the affiliation to the great houses.
Again, it’s time to have some fun on your own. HERE you will find another Kaggle notebook where you can demonstrate your network analysis skills even more!